On Consequential Validity
Dragos Iliescu, Faculty of Psychology and Educational Sciences, University of Bucharest, Sos. Panduri 90, 050657 Bucharest, Romania, E-mail: [email protected]
Samuel Greiff, Institute of Cognitive Science and Assessment (COSA), Luxembourg, Luxembourg
Problem Statement and the Intention of This Editorial

Our understanding of validity has evolved; what seemed innovative at one time is now interesting merely from a historical perspective (Camara & Brown, 1995). Most of the standard steps in this evolution are well-known to the assessment community: they are taught to students, applied by professionals, and part of the everyday knowledge repertoire of psychometricians. Validity was first "the correlation of test scores with some criterion," then "the degree to which a test measures what it purports to measure," then later "all validity is construct validity," and finally validity "refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests" (AERA, APA, & NCME, 1999, p. 9). Sireci (2012) discussed these evolutions and broke them down into three eras of validity theory: the empirical era (starting around 1900 and focused primarily on criterion-related validity and factor analysis), the theoretical era (starting around 1920 and focused on content analysis and construct validity), and the practical era (starting in the 1970s and advancing an argument-based approach to validation).

But some potentially disrupting proposals have not made the impact they could have. One such concept is consequential validity. In this editorial, we draw attention to it by outlining what it proposes and how it can make assessment better, thereby encouraging the development of a stronger stream of research in this important direction, research that EJPA would be thrilled to host.

Samuel Messick and His View on Validity

Consequential validity was proposed by Messick (1989) as an integral part of the validity argumentation for a test, pointing to the need to investigate the social consequences (both actual and potential) of testing. Although critics have noted over the years that "there is no agreement at the present time about either the importance or the meaning of the term" (Mehrens, 1997, p. 17), we note here the exact definition given by Messick: the consequential aspect of (construct) validity "appraises the value implications of score interpretation as a basis for action as well as the actual and potential consequences of test use, especially in regard to sources of invalidity related to issues of bias, fairness, and distributive justice" (Messick, 1995, p. 745).

Messick discussed two dimensions of what he called the consequential basis of validity. First, test labels should be accurate descriptions of what is measured, in order to guide stakeholders. At first glance, this resembles the classical approach: after all, if a test should measure what it intends to measure, the developer should above all lucidly label what the test measures. Messick, however, was concerned that oversimplification and professional slang in labeling may produce wrong expectations in stakeholders and other audiences. For instance, a label such as "resilience quotient" may lead users to assume that resilience is unidimensional, despite literature showing that it is a multidimensional construct. Second, and more important for our discussion, developers and users should appraise the consequences of using the respective test. This component was extended by Shepard (1993, 1997) to include both intended and unintended consequences of test use.

Consequential validity has attracted different levels of attention depending on the specific field. It has sparked little enthusiasm in clinical or occupational testing, but has probably had more traction in the educational domain than in any other area of test usage (Koretz, 2008). In educational settings, test results connected to schools can influence policies on school funding, curriculum and instructional content, and others. To this day, it is, however, one of the concepts least utilized in validation efforts (Cizek et al., 2008) and possibly still one of the most debated (Lees-Haley, 1996). When offering validity information, validation efforts will typically concentrate on convergent information, or on evaluating whether the assessment samples the target content (of knowledge, behaviors, or processes) consistently, fairly, and authentically (Messick, 1989; Wiliam, 2000), oftentimes completely ignoring consequential information.
We argue that this effect, a definitional bias, is a direct result of how we define "the test": of where we draw the boundaries between the test and test-related processes and components, and of what we, therefore, take responsibility for. We believe that tests are a force for good in the world, and that to improve them, researchers should develop awareness toward, and dedicate part of their work to, these aspects of validity.

Examples of How Consequential Validity Can Be Conceived

There are many ways to conceive of consequential validity. Let us look at two examples. As a first example, consider achievement tests, which are used widely in educational settings (classrooms and schools). They generate data that are used by teachers and parents and that are considered in school management. Such a test may be administered to a student correctly, covering the target domain and generating a reliable picture of the competence of the assessed student. These scores may be highly predictive of individual outcomes, such as grades in high-stakes exams, admission to higher education, or dropout. This looks like the perfect picture of a valid test. But sometimes the administration of the test also has other outcomes: negative feedback on performance in a course may lead to demotivation, to a failure to embrace other courses or areas of learning, or even to dropping out of school; the mere existence of the test may motivate teachers to teach to the test rather than toward the broader capacity that should be developed. This, in its entirety, begs the question: Should the test be judged only on the basis of the data it generates, or also on its consequences? If it is judged based on its consequences, who takes responsibility for these? If so, the very definition of the "test" may shift to include the variables and processes that produce the consequences; in that case, consequential validity becomes much easier to conceive.

As a second example, consider cognitive ability and personality tests used in occupational settings. Here, decision-makers, such as hiring managers, use these tools for personnel selection. Imagine a consultant who sets up a selection system, with validities computed on a sample of current employees. Test scores are then used by managers to make hiring decisions. A year later, the consultant engages in a follow-up study that investigates the system, only to find that the quality of the decisions is far lower than predicted by the predictor-criterion correlations. This is due to the fact that managers do not understand the score reports well enough to build on them in their decision making; they therefore either ignore the scores, expose themselves only selectively to the results, or misconstrue their meaning, thus leading to mis-decisions that could have been avoided. Should the tests, which show good correlations with the criteria, be considered valid in this organization? In terms of decisions, the "tests" in this case arguably also comprise lucid score reports and specialized training for users (the sketch at the end of this section illustrates how such misuse erodes the benefit implied by a validity coefficient).

Consequential validity is more easily integrated into validity discussions when professionals adopt the view that validity does not apply to instruments but to inferences (Cizek et al., 2008; Messick, 1989). The current Standards (AERA, APA, & NCME, 2014) call for the collection of validity evidence stemming from this reasoning (p. 11). Consequences thus appear as part of this corpus of evidence, and results concerning them should be factored in. However, embracing this view requires us to note the crux of the most ardent critiques of consequential validity: measurement quality and decision quality are fundamentally different things. As Mehrens (1997) put it, "This confounding of treatment efficacy (or decision-making wisdom) seems unwise to me" (p. 17); he goes on to describe how this distinction is seen in the medical profession: when a medic takes the temperature of a patient, he/she performs a measurement, the quality of which is distinct from the wisdom of the ensuing treatment.

Here, we wish to avoid this debate (albeit acknowledging its importance); instead, we take an angle that urges attention to consequences because of the ever-changing nature of the elements covered by the adopted definition of "the test." Indeed, this definition has moved through time from narrow to increasingly encompassing. The definition we adhere to and encourage will increase or decrease what is legitimately considered intrinsic to the test, and adherence to a narrower or broader definition will shape psychometric practice by motivating different usage.

The traditional definition of the test has been, unfortunately, narrow and centered on the items: a test is a procedure or method that comprises a set of standardized items (e.g., stimuli, questions, tasks), scored in a standardized manner and used to examine and evaluate individual differences (e.g., in emotions, cognitions, attitudes, skills, abilities, competencies) (Anastasi & Urbina, 1997; Cronbach, 1990). Arguments have been made in favor of a more encompassing definition (Iliescu, 2017), one that includes, among others: the technical manual, the training materials (including the delivery of training sessions), the certification scheme (and process), the design of the technology that generates score reports, the protection of access (to qualified users, and of intellectual property), and the published research available to users. The list could certainly be extended if consideration is given to how tests are received by, and act upon, individuals, groups, communities, and society. Embracing such a broader definition of "a test" implicitly requires us to relate to the consequences of testing.
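To make the personnel selection example concrete, here is a minimal simulation sketch in Python; the assumed predictor-criterion correlation of .50, the selection ratio of 20%, and the weight that decision-makers place on the scores are hypothetical illustration values, not figures from any study cited above.

```python
# Minimal sketch: how misuse of test scores can erode the practical benefit
# implied by a predictor-criterion correlation. All parameter values
# (true validity, selection ratio, weight w) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_applicants = 10_000
true_validity = 0.50      # assumed correlation between test score and job performance
selection_ratio = 0.20    # hire the top 20% of applicants

# Standardized test scores and a criterion correlated with them at true_validity
test = rng.standard_normal(n_applicants)
criterion = (true_validity * test
             + np.sqrt(1 - true_validity**2) * rng.standard_normal(n_applicants))

def mean_criterion_of_hired(decision_basis):
    """Average criterion performance of the applicants ranked highest on decision_basis."""
    n_hired = int(selection_ratio * n_applicants)
    hired = np.argsort(decision_basis)[-n_hired:]
    return criterion[hired].mean()

# Managers who only partly rely on the scores: decisions mix the test score
# with score-irrelevant impressions (the weight w is a hypothetical value).
w = 0.3
diluted = w * test + (1 - w) * rng.standard_normal(n_applicants)

print("Mean performance of hires, score-based decisions:  %.2f" % mean_criterion_of_hired(test))
print("Mean performance of hires, score-diluted decisions: %.2f" % mean_criterion_of_hired(diluted))
```

Even though the test's predictor-criterion correlation is unchanged in both scenarios, the benefit the organization actually obtains depends on how the scores are used in decisions, which is exactly the territory consequential validity asks us to examine.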
What We Propose

We, therefore, propose that these supplementary facets of validity be treated as substantive, yet understudied, research areas. To this end, we raise the topic here and explicitly welcome in EJPA studies on consequential validity, addressing, for example: report development (how reports, automated or not, on psychological test results can be constructed for maximum benefit to stakeholders); feedback on scores (in what form feedback on scores is effective for stakeholders); individual consequences of testing (how the effects of testing on individual test takers can be amplified or, respectively, mitigated); and social consequences of testing (how a testing program contributes to change in society). All these are questions the field should discuss and contribute to. Extant research is unexpectedly scarce, and we strongly believe that robust work on consequential validity can drive positive change and strengthen the position of testing in society.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. AERA.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. AERA.
Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Prentice Hall.
Camara, W. J., & Brown, D. C. (1995). Educational and employment testing: Changing concepts in measurement and policy. Educational Measurement: Issues and Practice, 14(1), 5–11. https://doi.org/10.1111/j.1745-3992.1995.tb00845.x
Cizek, G. J., Rosenberg, S. L., & Koons, H. H. (2008). Sources of validity evidence for educational and psychological tests. Educational and Psychological Measurement, 68(3), 397–412. https://doi.org/10.1177/0013164407310130
Cronbach, L. J. (1990). Essentials of psychological testing (5th ed.). Harper Collins.
Iliescu, D. (2017). Adapting tests in linguistic and cultural situations. Cambridge University Press.
Koretz, D. (2008). Measuring up: What educational testing really tells us. Harvard University Press.
Lees-Haley, P. R. (1996). Alice in validityland, or the dangerous consequences of consequential validity. American Psychologist, 51(9), 981–983. https://doi.org/10.1037/0003-066X.51.9.981
Mehrens, W. A. (1997). The consequences of consequential validity. Educational Measurement: Issues and Practice, 16(2), 16–18.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). American Council on Education & NCME.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749. https://doi.org/10.1037/0003-066X.50.9.741
Shepard, L. A. (1993). Evaluating test validity. In L. Darling-Hammond (Ed.), Review of research in education (Vol. 19, pp. 405–450). American Educational Research Association.
Shepard, L. A. (1997). The centrality of test use and consequences for test validity. Educational Measurement: Issues and Practice, 16(2), 5–24. https://doi.org/10.1111/j.1745-3992.1997.tb00585.x
Sireci, S. G. (2012, July 5). What have we learned from 100 years of validation? State-of-the-art address delivered at the 8th Annual Conference of the International Test Commission, Amsterdam, The Netherlands.
Wiliam, D. (2000). The meanings and consequences of educational assessments. Critical Quarterly, 42(1), 105–127.

European Journal of Psychological Assessment (2021), 37(3), 163–166. Published online June 3, 2021. https://doi.org/10.1027/1015-5759/a000664. © 2021 Hogrefe Publishing.